AITopics | random forest regression

random forest regression

Artificial Intelligence for Cost-Aware Resource Prediction in Big Data Pipelines

Goyal, Harshit

arXiv.org Artificial Intelligence | Oct-8-2025

Efficient resource allocation is a key challenge in modern cloud computing. Over-provisioning leads to unnecessary costs, while under-provisioning risks performance degradation and SLA violations. This work presents an artificial intelligence approach to predict resource utilization in big data pipelines using Random Forest regression. We preprocess the Google Borg cluster traces to clean, transform, and extract relevant features (CPU, memory, usage distributions). The model achieves high predictive accuracy (R² = 0.99, MAE = 0.0048, RMSE = 0.137), capturing non-linear relationships between workload characteristics and resource utilization. Error analysis reveals strong performance on small-to-medium jobs, with higher variance in rare large-scale jobs. These results demonstrate the potential of AI-driven prediction for cost-aware autoscaling in cloud environments, reducing unnecessary provisioning while safeguarding service quality.
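The three scores quoted in this abstract (R², MAE, RMSE) have standard definitions, and a minimal sketch of how they are computed may help when comparing the entries on this page. The function name `regression_metrics` and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (mae, rmse, r2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    # Mean absolute error: average size of the errors.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalizes large errors more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R squared: 1 minus residual variation over total variation around the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Made-up utilization values, for illustration only.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.10, 0.25, 0.25, 0.40]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same predictions, and a large gap between the two usually points to a few big errors.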

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.05127

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Data Science > Data Mining > Big Data (0.62)

A Cost-Effective Framework for Predicting Parking Availability Using Geospatial Data and Machine Learning

Bagosher, Madyan, Mustafa, Tala, Alsmirat, Mohammad, Al-Ali, Amal, Jawarneh, Isam Mashhour Al

arXiv.org Artificial Intelligence | Aug-21-2025

As urban populations continue to grow, cities face numerous challenges in managing parking and determining occupancy. This issue is particularly pronounced on university campuses, where students need to find vacant parking spots quickly and conveniently during class hours. The limited availability of parking spaces on campuses underscores the need for efficient systems that allocate vacant parking spots effectively. We propose a smart framework that integrates multiple data sources, including street maps, mobility, and meteorological data, through a spatial join operation to capture parking behavior and vehicle movement patterns over three consecutive days, at hourly intervals between 7 AM and 3 PM. The system does not require any sensing tools to be installed in the street or in the parking area, since all the data needed is collected using location services. The framework uses the expected parking entrance and time to specify a suitable parking area. Several forecasting models, namely Linear Regression, Support Vector Regression (SVR), Random Forest Regression (RFR), and Long Short-Term Memory (LSTM), are evaluated. Hyperparameters were tuned using grid search, and model performance is assessed using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R²). Random Forest Regression achieved the lowest RMSE of 0.142 and the highest R² of 0.582. However, given the time-series nature of the task, an LSTM model may perform better with additional data and longer timesteps.
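Grid search, mentioned in this abstract, is model-agnostic: every combination in a parameter grid is scored on held-out data and the best one is kept. The sketch below illustrates this with a deliberately simple stand-in model (a 1-D k-nearest-neighbour average), not the paper's Random Forest or LSTM. The names `knn_predict`, `validation_rmse`, and `grid_search`, as well as the hour/occupancy values, are hypothetical.

```python
import itertools
import math


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def validation_rmse(params, train, valid):
    """RMSE of the stand-in model on the validation pairs."""
    squared = [(y - knn_predict(train, x, params["k"])) ** 2 for x, y in valid]
    return math.sqrt(sum(squared) / len(squared))


def grid_search(param_grid, train, valid):
    """Try every combination in param_grid and keep the lowest validation RMSE."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_rmse(params, train, valid)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Made-up (hour of day, occupancy fraction) pairs, for illustration only.
train = [(7, 0.2), (8, 0.5), (9, 0.8), (10, 0.9),
         (11, 0.9), (12, 0.7), (13, 0.6), (14, 0.4)]
valid = [(8.5, 0.65), (12.5, 0.65)]
best_params, best_score = grid_search({"k": [1, 2, 8]}, train, valid)
```

In practice the score is usually averaged over several cross-validation folds rather than taken from a single validation split as it is here.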

artificial intelligence, deep learning, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2508.14125

Country:

Europe (0.47)
Asia > Middle East > UAE (0.15)
North America > United States > Texas (0.15)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Agile Climate-Sensor Design and Calibration Algorithms Using Machine Learning: Experiments From Cape Point

Barrett, Travis, Mishra, Amit Kumar

arXiv.org Artificial Intelligence | Mar-9-2025

In this paper, we describe the design of an inexpensive and agile climate sensor system which can be repurposed easily to measure various pollutants. We also propose the use of machine learning regression methods to calibrate CO2 data from this cost-effective sensing platform to a reference sensor at the South African Weather Service's Cape Point measurement facility. We report the performance of these methods and find that Random Forest Regression performed best in this scenario. This shows that these machine learning methods can be used to improve the performance of cost-effective sensor platforms and possibly extend the time between manual calibrations of sensor networks.

cape point, platform, sensor, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/I2MTC53148.2023.10176000

2503.06777

Country:

North America > United States (0.35)
Africa > Malawi (0.14)
Africa > South Africa > Western Cape > Cape Town (0.06)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.31)

Personalized Prediction Models for Changes in Knee Pain among Patients with Osteoarthritis Participating in Supervised Exercise and Education

Rafiei, M., Das, S., Bakhtiari, M., Roos, E. M., Skou, S. T., Grønne, D. T., Baumbach, J., Baumbach, L.

arXiv.org Artificial Intelligence | Oct-16-2024

Knee osteoarthritis (OA) is a widespread chronic condition that impairs mobility and diminishes quality of life. Despite the proven benefits of exercise therapy and patient education in managing the OA symptoms of pain and functional limitation, these strategies are often underutilized. Personalized outcome prediction models can help motivate and engage patients, but the accuracy of existing models in predicting changes in knee pain remains insufficiently examined. We aim to validate existing models and introduce a concise personalized model predicting changes in knee pain from before to after participation in a supervised education and exercise therapy program (GLA:D) for knee OA patients. Our models use self-reported patient information and functional measures. To reduce the number of variables, we evaluated variable importance and applied clinical reasoning. We trained random forest regression models and compared the rate of true predictions of our models with that obtained by using average values. We evaluated the performance of a full, a continuous, and a concise model including all 34, all 11 continuous, and the six most predictive variables, respectively. All three models performed similarly and were comparable to the existing model, with R² values of 0.31-0.32 and RMSEs of 18.65-18.85, despite our increased sample size. Allowing a deviation of 15 VAS points from the true change in pain, our concise model and the average-value approach estimated the change in pain correctly in 58% and 51% of cases, respectively. Our supplementary analysis led to similar outcomes. Our concise personalized prediction model predicts changes in knee pain following the GLA:D program more accurately than average pain improvement values. Neither the increase in sample size nor the inclusion of additional variables improved on previous models. To improve predictions, new variables beyond those in the GLA:D are required.
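The "rate of true predictions" in this abstract counts a prediction as correct when it falls within 15 VAS points of the observed change. A minimal sketch of that comparison follows, assuming an inclusive tolerance (the abstract does not say whether the bound is inclusive) and a baseline that predicts the mean observed change for everyone. The name `hit_rate` and all numbers are hypothetical.

```python
def hit_rate(observed, predicted, tolerance=15.0):
    """Fraction of predictions within `tolerance` of the observed value."""
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)


# Made-up changes in pain on a 0-100 VAS scale (negative = improvement).
observed = [-30.0, -10.0, 0.0, -45.0, -20.0]
model_pred = [-25.0, -12.0, -10.0, -28.0, -22.0]

# Baseline: predict the same average change for every patient.
# Here the average is taken from the same sample purely for brevity.
mean_change = sum(observed) / len(observed)
baseline_pred = [mean_change] * len(observed)

model_rate = hit_rate(observed, model_pred)
baseline_rate = hit_rate(observed, baseline_pred)
```

A tolerance-based rate like this is easier to explain to patients than RMSE, but it depends heavily on the chosen tolerance, so the two kinds of score are best read together.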

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

2410.12597

Country:

North America > United States > Massachusetts > Worcester County > Holden (0.04)
Europe > Denmark > Southern Denmark (0.04)
Europe > Portugal > Braga > Braga (0.04)
Europe > Germany > Hamburg (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Privacy-preserving federated prediction of pain intensity change based on multi-center survey data

Das, Supratim, Rafie, Mahdie, Kammer, Paula, Skou, Søren T., Grønne, Dorte T., Roos, Ewa M., Hajek, André, König, Hans-Helmut, Ullah, Md Shihab, Probul, Niklas, Baumbach, Jan, Baumbach, Linda

arXiv.org Artificial Intelligence | Sep-12-2024

Background: Patient-reported survey data are used to train prognostic models aimed at improving healthcare. However, such data are typically distributed across multiple centers and, for privacy reasons, cannot easily be centralized in one data repository. Models trained locally are less accurate, robust, and generalizable. We present and apply privacy-preserving federated machine learning techniques for prognostic model building, where local survey data never leaves the legally safe harbors of the medical centers. Methods: We used centralized, local, and federated learning techniques on two healthcare datasets (GLA:D data from the five health regions of Denmark and international SHARE data from 27 countries) to predict two different health outcomes. We compared linear regression, random forest regression, and random forest classification models trained on local data with those trained on the entire data in a centralized and in a federated fashion. Results: In GLA:D data, federated linear regression (R² 0.34, RMSE 18.2) and federated random forest regression (R² 0.34, RMSE 18.3) models outperform their local counterparts (i.e., R² 0.32, RMSE 18.6 and R² 0.30, RMSE 18.8) with statistical significance. We also found that centralized models (R² 0.34, RMSE 18.2 and R² 0.32, RMSE 18.5, respectively) did not perform significantly better than the federated models. In SHARE, the federated model (AC: 0.78, AUROC: 0.71) and centralized model (AC: 0.84, AUROC: 0.66) perform significantly better than the local models (AC: 0.74, AUROC: 0.69). Conclusion: Federated learning enables the training of prognostic models from multi-center surveys without compromising privacy and with only minimal or no compromise regarding model performance.
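To illustrate how a linear regression can be fitted without pooling raw records, the sketch below has each center share only summary statistics, which a coordinator adds up and solves. This is one common construction for one-feature least squares and is an assumption here, not necessarily the protocol used in the paper. All function names and data are hypothetical.

```python
STAT_KEYS = ("n", "sx", "sy", "sxy", "sxx")


def local_statistics(xs, ys):
    """Computed inside a center: sums only, no individual records leave."""
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
        "sxx": sum(x * x for x in xs),
    }


def aggregate(stats_list):
    """Computed by the coordinator: add the per-center sums together."""
    total = {key: 0.0 for key in STAT_KEYS}
    for stats in stats_list:
        for key in STAT_KEYS:
            total[key] += stats[key]
    return total


def fit_from_statistics(s):
    """Ordinary least squares slope and intercept from the pooled sums."""
    slope = (s["n"] * s["sxy"] - s["sx"] * s["sy"]) / (s["n"] * s["sxx"] - s["sx"] ** 2)
    intercept = (s["sy"] - slope * s["sx"]) / s["n"]
    return slope, intercept


# Made-up data for two centers, both following y = 2x + 1.
center_a = local_statistics([1, 2, 3], [3, 5, 7])
center_b = local_statistics([4, 5], [9, 11])
slope, intercept = fit_from_statistics(aggregate([center_a, center_b]))
```

With this construction the federated fit equals the centralized one exactly, which is consistent with the abstract's observation that federated and centralized linear models score almost the same. Shared sums can still leak information for very small centers, so real deployments typically add further protection.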

federated learning, federated model, local model, (14 more...)

arXiv.org Artificial Intelligence

2409.07997

Country:

Oceania > Australia (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Denmark > Southern Denmark (0.05)
(29 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.79)
Information Technology > Data Science > Data Mining > Big Data (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

A Notion of Feature Importance by Decorrelation and Detection of Trends by Random Forest Regression

Gerstorfer, Yannick, Krieg, Lena, Hahn-Klimroth, Max

arXiv.org Artificial Intelligence | Mar-2-2023

In many studies, we want to determine the influence of certain features on a dependent variable. More specifically, we are interested in the strength of the influence (i.e., is the feature relevant?) and, if so, in how the feature influences the dependent variable. Recently, data-driven approaches such as random forest regression have found their way into applications (Boulesteix et al., 2012). These models allow one to directly derive measures of feature importance, which are a natural indicator of the strength of the influence. For the relevant features, the correlation or rank correlation between the feature and the dependent variable has typically been used to determine the nature of the influence. More recent methods, some of which can also measure interactions between features, are based on a modeling approach. In particular, when machine learning models are used, SHAP scores are a recent and prominent method to determine these trends (Lundberg et al., 2017). In this paper, we introduce a novel notion of feature importance based on the well-studied Gram-Schmidt decorrelation method. Furthermore, we propose two estimators for identifying trends in the data using random forest regression, the so-called absolute and relative transversal rate. We empirically compare the properties of our estimators with those of well-established estimators on a variety of synthetic and real-world datasets.
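The Gram-Schmidt decorrelation that this abstract builds on can be sketched in a few lines: center each feature column, then subtract its projection onto every previously processed column. The code below shows only this textbook step, not the paper's importance measure or its transversal-rate estimators. The function names and the two sample columns are hypothetical.

```python
def center(values):
    """Subtract the mean so that orthogonal columns are also uncorrelated."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt_decorrelate(columns):
    """Return centered columns, each orthogonal to all earlier ones."""
    result = []
    for column in columns:
        residual = center(column)
        for basis in result:
            norm_sq = dot(basis, basis)
            if norm_sq > 0.0:
                # Remove the part of this feature explained by an earlier one.
                coef = dot(residual, basis) / norm_sq
                residual = [r - coef * b for r, b in zip(residual, basis)]
        result.append(residual)
    return result


# Two strongly correlated made-up features.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 4.0, 7.0, 9.0]
d1, d2 = gram_schmidt_decorrelate([x1, x2])
sample_covariance = dot(d1, d2) / (len(d1) - 1)
```

The result depends on the order of the columns, since each feature keeps only what the earlier ones could not explain.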

artificial intelligence, feature importance, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2303.01156

Country:

North America > United States > California (0.05)
Europe > Germany (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)

Predicting housing prices and analyzing real estate market in the Chicago suburbs using Machine Learning

Xu, Kevin, Nguyen, Hieu

arXiv.org Artificial Intelligence | Oct-12-2022

The pricing of housing properties is determined by a variety of factors. However, post-pandemic markets have experienced volatility in the Chicago suburb area, which has affected house prices greatly. In this study, the Naperville/Bolingbrook real estate market was analyzed to predict property prices from housing attributes using machine learning models, and to evaluate the effectiveness of such models in a volatile market. Sales data from 2018 until the summer of 2022 were collected from Redfin, a real estate website. By analyzing sales in this time range, we can also look at the state of the housing market and identify trends in price. The models used were linear regression, support vector regression, decision tree regression, random forest regression, and XGBoost regression. To analyze the results, the MAE, RMSE, and R-squared values of each model were compared. The XGBoost model was found to perform best in predicting house prices despite the additional volatility introduced by post-pandemic conditions. After modeling, Shapley values (SHAP) were used to evaluate the weights of the variables in the constructed models.

artificial intelligence, machine learning, regression, (15 more...)

arXiv.org Artificial Intelligence

2210.06261

Country:

North America > United States > Illinois > Cook County > Chicago (0.61)
North America > United States > Illinois > Will County > Naperville (0.25)
North America > United States > Connecticut > Tolland County > Storrs (0.14)

Genre: Research Report > New Finding (0.49)

Industry: Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.72)

House Price Prediction using a Random Forest Classifier

#artificialintelligence | Dec-21-2021, 13:57:07 GMT

In this blog post, I will use machine learning and Python for predicting house prices. I will use a Random Forest Classifier (in fact Random Forest regression). In the end, I will demonstrate my Random Forest Python algorithm! There is no law except the law that there is no law. Data Science is about discovering hidden patterns (laws) in your data.

house price prediction, random forest classifier, random forest regression, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Bayesian Sample Size Prediction for Online Activity

Richardson, Thomas, Liu, Yu, McQueen, James, Hains, Doug

arXiv.org Machine Learning | Nov-23-2021

In many contexts it is useful to predict the number of individuals in some population who will initiate a particular activity during a given period. Examples include the number of users who will install a software update, or the number of customers who will use a new feature on a website or participate in an A/B test. In practical settings, there is heterogeneity amongst individuals with regard to the distribution of time until they initiate. For this reason it is inappropriate to assume that the number of new individuals observed on successive days will be identically distributed. Given observations on the number of unique users participating in an initial period, we present a simple but novel Bayesian method for predicting the number of additional individuals who will participate during a subsequent period. We illustrate the performance of the method in predicting sample size in online experimentation.

customer, experiment, new customer, (14 more...)

arXiv.org Machine Learning

2111.12157

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States (0.04)
Asia > Vietnam > Long An Province (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Machine Learning Basics: Random Forest Regression

#artificialintelligence | Jul-18-2020, 07:35:16 GMT

Previously, I explained various regression models such as Linear, Polynomial, Support Vector and Decision Tree Regression. In this article, we will go through the code for applying Random Forest Regression, which is an extension of the Decision Tree Regression implemented previously. The Decision Tree is an easily understood and interpreted algorithm, but a single tree may not be enough for the model to learn the features of the data. Random Forest, on the other hand, is also a "Tree"-based algorithm, one that combines the qualities of multiple Decision Trees to make decisions. It can therefore be thought of as a 'Forest' of trees, hence the name "Random Forest".
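The idea of combining many trees can be sketched without any library: fit several very shallow trees, each on a bootstrap resample of the data, and average their predictions. The code below is a simplified illustration and is not the article's own implementation. It uses depth-1 trees on a single feature, so the per-split feature sampling of a full Random Forest does not apply. All names and data are hypothetical.

```python
import random


def fit_stump(points):
    """Fit a depth-1 regression tree on (x, y) pairs by minimizing squared error."""
    xs = sorted({x for x, _ in points})
    overall = sum(y for _, y in points) / len(points)
    best_error, best_stump = float("inf"), (None, overall, overall)
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if error < best_error:
            best_error, best_stump = error, (threshold, left_mean, right_mean)
    return best_stump


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    if threshold is None:  # the resample had a single distinct x value
        return left_mean
    return left_mean if x <= threshold else right_mean


def fit_forest(points, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(points) for _ in points]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """The forest prediction is the average of the individual tree predictions."""
    predictions = [predict_stump(stump, x) for stump in forest]
    return sum(predictions) / len(predictions)


# Made-up data with a jump between x = 3 and x = 4.
points = [(1, 1.0), (2, 1.2), (3, 0.9), (4, 3.1), (5, 2.9), (6, 3.0)]
forest = fit_forest(points)
low = predict_forest(forest, 1)
high = predict_forest(forest, 6)
```

Averaging smooths out the quirks of any single resample, which is why an ensemble tends to be more stable than one tree trained on the same data.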

artificial intelligence, machine learning, regression, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
